Optimal Suffix Tree Construction with Large Alphabets
نویسنده
چکیده
The suux tree of a string is the fundamental data structure of combinatorial pattern matching. Weiner Wei73], who introduced the data structure, gave an O(n) time algorithm algorithm for building the suux tree of an n character string drawn from a constant size alphabet. In the comparison model, there is a trivial (n log n) time lower bound based on sorting, and Weiner's algorithm matches this bound trivially. Since Weiner's paper, the main open question has been how to deal with integer alphabets. There is no super-linear lower bound, and the fastest known algorithm was the O(n log n) time comparison based algorithm. We settle this open problem by closing the gap: we build suux trees in linear time for integer alphabet.
منابع مشابه
Constructing the Suffix Tree of a Tree with a Large Alphabet
The problem of constructing the suffix tree of a tree is a generalization of the problem of constructing the suffix tree of a string. It has many applications, such as in minimizing the size of sequential transducers and in tree pattern matching. The best-known algorithm for this problem is Breslauer’s O(n log |Σ|) time algorithm where n is the size of the CS-tree and |Σ| is the alphabet size, ...
متن کاملSuffix Tree
SYNONYMS Compact suffix trie DEFINITION The suffix tree S(y) of a non-empty string y of length n is a compact trie representing all the suffixes of the string. The suffix tree of y is defined by the following properties: All branches of S(y) are labeled by all suffixes of y. • • Edges of S(y) are labeled by strings. • Internal nodes of S(y) have at least two children. • Edges outgoing an intern...
متن کاملThe Virtual Suffix Tree: An Efficient Data Structure for Suffix Trees and Suffix Arrays
We introduce the VST (virtual suffix tree), an efficient data structure for suffix trees and suffix arrays. Starting from the suffix array, we construct the suffix tree, from which we derive the virtual suffix tree. The VST provides the same functionality as the suffix tree, including suffix links, but at a much smaller space requirement. It has the same linear time construction even for large ...
متن کاملSimple Linear Work Suffix Array Construction
A suffix array represents the suffixes of a string in sorted order. Being a simpler and more compact alternative to suffix trees, it is an important tool for full text indexing and other string processing tasks. We introduce the skew algorithm for suffix array construction over integer alphabets that can be implemented to run in linear time using integer sorting as its only nontrivial subroutin...
متن کاملLinear-Time Construction of Suffix Arrays
The time complexity of suffix tree construction has been shown to be equivalent to that of sorting: O(n) for a constant-size alphabet or an integer alphabet and O(n logn) for a general alphabet. However, previous algorithms for constructing suffix arrays have the time complexity of O(n logn) even for a constant-size alphabet. In this paper we present a linear-time algorithm to construct suffix ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1997